Do CRM Systems Cause One-to-One
Marketing Effectiveness?

Sunil Mithas, Daniel Almirall and M. S. Krishnan 



Abstract. This article provides an assessment of the causal effect of 
customer relationship management (CRM) applications on one-to-one 
marketing effectiveness. We use a potential outcomes based propensity 
score approach to assess this causal effect. We find that firms using 
CRM systems have greater levels of one-to-one marketing effectiveness. 
We discuss the strengths and challenges of using the propensity score 
approach to design and execute CRM related observational studies. We 
also discuss the applicability of the framework in this paper to study 
typical causal questions in business and electronic commerce research at 
the firm, individual and economy levels, and to clarify the assumptions 
that researchers must make to infer causality from observational data. 

Key words and phrases: Causal analysis, potential outcomes, propen- 
sity score, matching estimator, customer relationship management sys- 
tems, electronic commerce. 



1. INTRODUCTION 

Electronic commerce, information systems and mar- 
keting researchers widely agree on the importance 
of causal analysis to gain a better understanding of 
the causal effects of managerial interventions and 
marketing programs (Boulding, Staelin, Ehret and 
Johnston, 2005; Gregor, 2006; Lucas, 1975). Does 
implementing an information technology (IT) intervention,
such as a customer relationship management (CRM) system,
improve firm performance
(Brynjolfsson and Hitt, 1998; Mithas, Krishnan and 
Fornell, 2005; Srinivasan and Moorman, 2005)? Do 
managerial interventions, such as customer satisfac- 
tion improvement programs, add to shareholder 
wealth (Fornell, Mithas, Morgeson and Krishnan, 
2006; Peppers, Rogers and Dorf, 1999; Rust and 
Kannan, 2003)? Do certain auction parameters max- 
imize seller or consumer surplus (Bapna, Goes and 
Gupta, 2001; Bapna, Jank and Shmueli, 2005; Kop- 
pius and Van Heck, 2002; Mithas and Jones, 2006)? 
Does acquiring an MBA degree cause an increase 
in the salary of an IT professional (Connolly, 2003; 
Mithas and Krishnan, 2004b)? Has "offshoring" 
caused a decline in the jobs and wages in the U.S. 
economy (Mithas and Whitaker, 2006; Venkatraman, 
2004)? The common theme across these questions — 
at the firm, individual and economy levels — is that 
they all are causal questions posed by managers, in- 
dividuals and policy makers about the causal effects 
of some intervention. 

Historically, with their early emphasis on individ- 
ual level phenomena, such as the impact of mode 
of information presentation on learning and perfor- 
mance (Lucas and Nielsen, 1980), electronic com- 
merce researchers answered causal questions using 
an experimental approach that permitted randomization,
the gold standard for assessing causality.
As researchers increasingly focus on firm level phe- 
nomena, particularly the business value of IT in- 
terventions, they must rely on observational data, 
because randomized field trials across actual firms 
are virtually impossible. In addition, the growing 
digitization of business processes is providing un- 
precedented opportunities to collect richer obser- 
vational data (more for individual level behavior 
than for firm level research), making it easier to 
pose new questions that researchers did not or could 
not ask before (Jank and Shmueli, 2006). However, 
the use of observational (nonexperimental) data in 
these settings raises concerns regarding the ability 
to interpret causal results from empirical analyses 
(Rosenbaum, 1999). What exactly is the problem 
with using observational data? What benefits does 
randomization provide in the experimental setting 
and what assumptions are needed to make causal 
claims using nonexperimental data? Can statistical 
methods help answer questions of causality? The po- 
tential outcomes framework for causal inference, de- 
scribed in more detail by Rubin (1974) and Holland 
(1986), sheds light on these questions and provides 
researchers with tools to answer some of the ques- 
tions posed in this Introduction. 

In this paper, we point to the usefulness of a po- 
tential outcomes approach, called propensity score 
stratification (Rosenbaum and Rubin, 1983b), to 
investigate causal relationships in the substantive 
domain of electronic commerce research. For further 
discussion of the potential outcomes approach, refer 
to Angrist and Krueger (1999), Heckman (2005), 
Imbens (2004), Rosenbaum (2002), Rubin (2005) 
and Winship and Morgan (1999). We address a prob- 
lem in the electronic commerce domain and esti- 
mate the causal effect of CRM systems on one-to- 
one marketing effectiveness. The approach relies on 
the assumption of strong ignorability, which implies 
that assignment to a treatment group is independent 
of potential outcomes conditional on observed pre-treatment
covariates (Rosenbaum and Rubin, 1983b).

2. CRM SYSTEMS AND ONE-TO-ONE 
MARKETING EFFECTIVENESS 

Firms invest over $50 billion each year on IT appli- 
cations (such as CRM systems) to streamline 
customer-interfacing business processes. A primary 
objective of these systems is to improve one-to-one
marketing effectiveness, that is, the ability of a firm
to target an individual customer based on previous 
history and purchasing behavior. However, media 
reports question whether these CRM implementa- 
tions have paid off (Harvard Management Update, 
2000), and from a business value perspective (Banker, 
Kauffman and Mahmood, 1993; Barua and Mukhopad- 
hyay, 2000; Brynjolfsson and Hitt, 1998; Kauffman 
and Weill, 1989; Sambamurthy, Bharadwaj and Grover, 
2003), there is a need to estimate whether CRM sys- 
tems have indeed caused an improvement in one-to- 
one marketing effectiveness. 

We obtained data for this study from Informa- 
tionWeek magazine. The data include information 
about firms' IT systems and related business ben- 
efits for the year 1999, and were collected by In- 
formationWeek between late 1999 and early 2000 
as part of a more comprehensive survey to bench- 
mark firms' IT infrastructure and managerial prac- 
tices in their respective industries. InformationWeek 
has been surveying top IT managers (including Vice 
Presidents, Chief Information Officers and Direc- 
tors) of large firms in the United States since 1986 
to identify the firms that are the best users of IT. 
InformationWeek surveys are typically sent to very 
large firms such as Fortune 1000 firms. Because In- 
formationWeek has a significant presence in the IT 
business community and offers visibility and incen- 
tives to participating firms, we believe that the high 
response rate for these surveys compares favorably 
to other firm level academic studies. Information- 
Week is considered to be a reliable source of in- 
formation, and previous academic studies have also 
used data from InformationWeek surveys (Mithas, 
Krishnan and Fornell, 2005; Rai, Patnayakuni and 
Patnayakuni, 1997). Our sample consists of 487 firms. 

We now specify the effect we wish to estimate by using the
language of potential outcomes (Rubin, 2005). Let t denote a
firm's CRM status: t = 1 if a firm adopts CRM and t = 0 if a
firm does not adopt CRM. Let the potential outcome Y(t) denote
a firm's assessment of its one-to-one marketing effectiveness
from its customer related IT applications. For a fixed t,
Y(t) = 1 if the firm is effective in one-to-one marketing and
Y(t) = 0 if a firm is not effective. Observe that each firm has
two potential response values: Y(t = 1) is the firm's
effectiveness had the firm adopted CRM and Y(t = 0) is the
firm's effectiveness had the firm not adopted CRM. Causal
effects are defined as contrasts in these two quantities.

Using this notation, we denote the average causal effect A of
CRM on Y as the difference in the proportion of firms with
effective one-to-one marketing programs had all firms adopted
CRM versus the proportion with effective programs had none of
the firms adopted CRM. The average causal effect A describes
the change in the proportion of companies with effective
marketing programs that is caused by CRM. Note that lowercase t
is not a random variable, but is instead conceived as an index
of the response Y(t), which is assumed to exist for every firm
in our study. In other words, neither t nor Y(t) represents
observed data. The observed data analogs of t and Y(t) are
represented by the random variables uppercase Z and Y,
respectively. That is, we denote a firm's observed CRM status
using uppercase Z. In this study, 282 of the 487 firms adopted
a CRM system. The observed outcome variable Y is a firm's
assessment of the improvement in one-to-one marketing
effectiveness from its customer related IT applications. This
observed response is binary (1 indicates that a firm reported
an increase in one-to-one marketing effectiveness; 0 indicates
that a firm did not report an increase in one-to-one marketing
effectiveness). Note that the observed one-to-one marketing
effectiveness is only one of the two potential outcomes,
because we cannot observe the same firm both with CRM and
without CRM.
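
The definitions above can be summarized in symbols. The first line restates the average causal effect A as a contrast of the two potential outcomes; the second line, which links the observed Y to the potential outcomes through the observed status Z, is the usual convention of the potential outcomes framework and is written out here as an assumption, since the text states it only in words.

```latex
\begin{align*}
A &= E[Y(1)] - E[Y(0)] = \Pr\{Y(1) = 1\} - \Pr\{Y(0) = 1\}, \\
Y &= Z\,Y(1) + (1 - Z)\,Y(0).
\end{align*}
```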

Finally, the three-vector X is a set of observed pre-CRM
characteristics, including (1) MFC, a firm's industry sector
(manufacturing or services firm), (2) ITPC, a firm's amount of
IT investment as a percentage of revenue, and (3) CUSTAPPS, the
presence of other customer related IT systems at the firm prior
to CRM adoption (this 13-item scale indicates deployment of IT
systems to support the business processes involved in customer
acquisition and disposal of products and services offered by
firms, including product marketing information, multilingual
communication, personalized marketing offerings, dealer
locator, product configuration, price negotiation,
personalization, transaction system, on-line distribution and
fulfillment system, customer service and customer satisfaction
tracking). Table 1 shows summary statistics for the treatment
(CRM) and control (non-CRM) groups.

A naive estimate of A is the difference in proportion, which
we denote by D, of observed CRM and observed non-CRM firms that
report increases in one-to-one marketing effectiveness, that
is, D = E(Y | Z = 1) - E(Y | Z = 0), which in our sample is
equal to 0.30 [SE = 0.04, 95% CI = (0.22, 0.39)]. However, it
is problematic to attribute the causal effect A to the observed
difference in the proportion of CRM versus non-CRM firms that
report increases in one-to-one marketing effectiveness. This is
because we do not observe the performance of firms with CRM had
they not adopted CRM and vice versa, and because firms that
adopt CRM systems may be different in ways related to
performance outcomes (i.e., confounding or selection bias)
(Rosenbaum and Rubin, 1983b). Thus, we may have a selection
problem; that is, in symbols, D = A + Bias, where Bias is the
extent to which the CRM and non-CRM groups differ according to
pre-treatment variables (observed or unobserved) that also
predict Y.
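
As a rough check, the naive estimate D can be recomputed from the group sizes given in the text (282 CRM firms out of 487) and the rounded proportions in Table 1. The sketch below assumes a simple Wald standard error for a difference in proportions; the text does not say which formula was used, and because the inputs are rounded the interval may differ from the reported one in the second decimal.

```python
import math

# Group sizes and proportions as reported in the text and in Table 1.
n_crm, n_non = 282, 487 - 282      # 282 CRM firms, 205 non-CRM firms
p_crm, p_non = 0.67, 0.37          # rounded shares reporting an improvement


def naive_difference(p1, n1, p0, n0, z=1.96):
    """Difference in proportions with a Wald standard error and interval.

    The Wald formula is an assumption made for illustration.
    """
    d = p1 - p0
    se = math.sqrt(p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0)
    return d, se, (d - z * se, d + z * se)


D, se_D, ci_D = naive_difference(p_crm, n_crm, p_non, n_non)
print(f"D = {D:.2f}, SE = {se_D:.2f}, 95% CI = ({ci_D[0]:.2f}, {ci_D[1]:.2f})")
```

The point of the surrounding argument is that this number, however precisely estimated, equals A only when Bias is zero.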

Table 1
Characteristics of treatment and control groups before matching

                                         Non-CRM^a         CRM^a
                                         (control group)   (treatment group)   Difference
Outcome variable
  Improvement in one-to-one              0.37 (0.48)       0.67 (0.47)         0.30 (<0.01)
  marketing effectiveness
Observed covariates
  Customer-facing IT systems (CUSTAPPS)  6.77 (2.51)       8.24 (2.79)         1.47 (<0.01)
  IT investments (ITPC)                  2.91 (2.43)       3.59 (3.53)         0.68 (<0.01)
  Manufacturing (MFC)                    0.45 (0.50)       0.34 (0.48)         -0.11 (<0.01)

^a The mean and standard deviation (in parentheses) are given.

In an experimental setting where firms are randomly assigned
to either a treatment or control group, these selection
problems would not occur. The act of randomizing CRM in an
experimental setting, for example, would render treatment (CRM)
and control (non-CRM) groups equal and balanced on average
(both in observed and unobserved characteristics). That is,
Bias = 0 on average. The treatment effect can then be assessed
by comparing the mean outcomes of treatment and control groups
at a given time after the treatment is assigned. In contrast,
in a nonexperimental setting such as actual firms, CRM status
may be affected by one or more of a firm's observed and/or
unobserved characteristics. For example, the use of CRM systems
may be due to self-selection by managers or may be mandated by
industry consortiums or business partners (Mithas, Krishnan and
Fornell, 2005). Comparing the mean outcomes in observational
studies may therefore overestimate or underestimate the true
causal effect of CRM implementation, and even a larger sample
size would not remedy these selection problems.
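
The contrast between randomized and self-selected adoption can be made concrete with a small exact calculation. The sketch below uses an invented population with a single binary pre-treatment characteristic U; every probability in it is hypothetical and chosen only for illustration. It computes D = E(Y | Z = 1) - E(Y | Z = 0) analytically, so no sampling noise is involved.

```python
# Hypothetical two-type population. Firms with U = 1 are more likely to be
# effective with or without CRM. All numbers are invented for illustration.
P_U = {0: 0.5, 1: 0.5}      # share of each firm type
P_Y1 = {0: 0.5, 1: 0.7}     # Pr{Y(1) = 1 | U}
P_Y0 = {0: 0.3, 1: 0.5}     # Pr{Y(0) = 1 | U}


def true_effect():
    """Average causal effect E[Y(1)] - E[Y(0)] over the population."""
    return sum(P_U[u] * (P_Y1[u] - P_Y0[u]) for u in P_U)


def naive_difference(p_adopt):
    """E(Y | Z = 1) - E(Y | Z = 0) when Pr(Z = 1 | U = u) = p_adopt[u]."""
    p_z1 = sum(P_U[u] * p_adopt[u] for u in P_U)
    p_z0 = 1 - p_z1
    ey_z1 = sum(P_U[u] * p_adopt[u] * P_Y1[u] for u in P_U) / p_z1
    ey_z0 = sum(P_U[u] * (1 - p_adopt[u]) * P_Y0[u] for u in P_U) / p_z0
    return ey_z1 - ey_z0


A_true = true_effect()
D_selfselected = naive_difference({0: 0.3, 1: 0.8})   # adoption depends on U
D_randomized = naive_difference({0: 0.5, 1: 0.5})     # adoption ignores U
print(f"true effect = {A_true:.3f}")
print(f"naive D, self-selected adoption = {D_selfselected:.3f}")
print(f"naive D, randomized adoption = {D_randomized:.3f}")
```

Because the self-selected case is computed from population probabilities rather than a sample, the gap between D and the true effect is pure selection bias, which is why a larger sample would not remove it.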

3. A PROPENSITY SCORE APPROACH TO 
ESTIMATE CAUSAL EFFECT 

We begin by checking the differences between CRM 
and non-CRM firms on observed pre-CRM covari- 
ates. Figure 1 shows the distribution of observed 
covariates across CRM and non-CRM firms. As Figure 1
and Table 1 show, compared to non-CRM firms, CRM firms
have more customer-facing IT systems, invest more in
IT as a percentage of revenues
and are underrepresented in the manufacturing sec- 
tor. We conducted a more formal analysis of the se- 
lection into CRM status using a probit model with 
CRM status as the dependent variable and observed 
covariates as explanatory variables. The chi-square 
test in the probit model reveals that the selection 
model is significant compared to a model with no 
explanatory variables. Thus, CRM firms differ sig- 
nificantly from non-CRM firms with respect to ob- 
servable covariates in the probit model. Character- 
istics such as prior investments in customer-facing 
IT systems and a firm's industry sector significantly 
affect the probability of adopting CRM. Because it 
is likely that observed pretreatment covariates are 
also associated with observed one-to-one marketing 
effectiveness, this analysis suggests that Bias may 
be nonzero. 

Fig. 1. Covariate balance before stratification.

The following estimator of A relies on the assumption of
strong ignorability (Rosenbaum and Rubin, 1983b). This
assumption says that selection bias is due only to correlation
between observed firm characteristics and a firm's treatment
status, where treatment refers to the implementation or absence
of CRM systems. Formally, the assumption states that the
distribution of the observed CRM status does not depend on the
potential outcomes Y(0) and Y(1) given the observed covariates.
We have already shown that there exists some evidence of
selection based on observables. By making the strong
ignorability assumption, we are saying that there exists no
unmeasured or unknown pre-treatment variable (say U) that is
correlated directly with both CRM status and the outcome
variable one-to-one marketing effectiveness. [A firm's market
orientation (Kohli and Jaworski, 1990) is an example of one
such unmeasured covariate U that may violate this assumption,
because market orientation affects CRM adoption and may also
have a direct effect on customer related outcomes such as
one-to-one marketing effectiveness. Other unmeasured
pre-treatment covariates may exist, including variables not yet
known to electronic commerce researchers. A procedure with R
code that describes the sensitivity of our results to
violations of this assumption is available from the authors on
request.]
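
In symbols, the assumption described above can be written as follows. The conditional independence statement on the left restates the text; the overlap condition on the right is also part of Rosenbaum and Rubin's (1983b) definition of strong ignorability and is included here for completeness, since it is what a common-support check examines.

```latex
\[
\bigl(Y(0),\, Y(1)\bigr) \perp Z \mid X,
\qquad
0 < \Pr(Z = 1 \mid X = x) < 1 \quad \text{for all } x.
\]
```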

Under the assumption of strong ignorability, match- 
ing on the basis of observed covariates can be used to 
overcome the selection bias problem. With match- 
ing, the basic idea is to find a non-CRM firm (i.e., 
a "clone" in the sense of Rubin and Waterman, 2006) 
for every CRM firm such that the firms do not dif- 
fer in any way other than their CRM adoption sta- 
tus. Although not a serious issue in our CRM ap- 
plication that has only three covariates, multivari- 
ate matching is usually problematic because sample 
sizes are often not large enough to achieve match- 
ing on all observed covariates. This problem (known 
as the curse of dimensionality) becomes particularly 
severe if the covariates are of a continuous nature. 
Extending the previous work of Rubin (1977) that 
shows the efficacy of balancing on a single covari- 
ate, Rosenbaum and Rubin (1983b) proposed an in- 
genious solution to this problem. Their idea is to 
match on the propensity score instead. The propen- 
sity score is defined as the probability of being treated 
given observed covariates — which, in our setting, 
translates to the probability that a firm has adopted 
CRM given the covariates. We denote the propensity
score by e(x) = Pr(Z = 1 | X = x).
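
To make the definition concrete, the sketch below estimates e(x) in the simplest possible case: a single discrete covariate, where the propensity score is just the share of adopters in each covariate cell. This is not the probit model used in this paper; it is a swapped-in, nonparametric stand-in that only works when X takes few values. The firm records are invented.

```python
from collections import defaultdict

# Hypothetical firm records: (covariate cell x, CRM indicator Z).
# Here x is only the sector; the data are invented for illustration.
firms = [
    ("manufacturing", 1), ("manufacturing", 0),
    ("manufacturing", 0), ("manufacturing", 0),
    ("services", 1), ("services", 1),
    ("services", 1), ("services", 0),
]


def propensity_by_cell(records):
    """Estimate e(x) = Pr(Z = 1 | X = x) as the adopter share in each cell."""
    treated = defaultdict(int)
    total = defaultdict(int)
    for x, z in records:
        treated[x] += z
        total[x] += 1
    return {x: treated[x] / total[x] for x in total}


e_hat = propensity_by_cell(firms)
print(e_hat)
```

With continuous covariates such as ITPC, cells become too sparse for this to work, which is one reason a parametric model such as a probit is fitted instead.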

We apply a matching technique based on the estimated value of
propensity scores e(x), a function of the observed pre-CRM
characteristics. This approach, also known as propensity score
stratification, relies on forming subclasses (or strata) S(x)
based on the propensity scores (Rosenbaum and Rubin, 1984). It
can be seen as a way to further reduce the dimensionality of
observed covariates [i.e., X is reduced to e(x), which is
reduced to the subclasses S(x)]. Specifically, the objective of
subclassification is to create subclasses based on the
propensity score so that CRM and non-CRM firms have similar
values of the propensity score (thereby achieving balance on
the multivariate X). This enables a fair comparison (on Y) of
CRM and non-CRM firms within each subclass. Then, once an
appropriate level of balance within each subclass is achieved
[e.g., between e(x) = 0.2 and e(x) = 0.4], an estimate of A can
be obtained by taking a weighted average of the within strata
effects. Rosenbaum and Rubin (1983b) show that it is sufficient
to condition on the univariate propensity score e(x) [and thus
S(x)] to remove bias due to multivariate X. To construct the
subclasses, we use the propensity scores based on the probit
model mentioned above (Becker and Ichino, 2002). (We also tried
interaction and quadratic terms in our propensity score models.
Because they did not affect the balance of covariates
significantly across the CRM and non-CRM groups, we did not use
them in our final analysis.)
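
The stratified estimator described above can be sketched in a few lines. The function below is a minimal illustration, not the routine used for the analysis: it assumes that strata are weighted by their share of all firms (the text says only "a weighted average"), and the records and cut points are invented.

```python
def stratified_effect(records, cuts):
    """Weighted average of within-stratum differences in mean outcome.

    records: iterable of (propensity score, Z, Y) triples.
    cuts: increasing stratum boundaries, e.g. [0.2, 0.6, 1.0]; a score s
          falls in stratum k when cuts[k] <= s < cuts[k + 1].
    Strata are weighted by their share of all classified firms, which is
    an assumption made for this sketch.
    """
    strata = [{0: [], 1: []} for _ in range(len(cuts) - 1)]
    for score, z, y in records:
        for k in range(len(cuts) - 1):
            if cuts[k] <= score < cuts[k + 1]:
                strata[k][z].append(y)
                break
    n_total = sum(len(s[0]) + len(s[1]) for s in strata)
    estimate = 0.0
    for s in strata:
        if not s[0] and not s[1]:
            continue  # an empty stratum contributes nothing
        if not s[0] or not s[1]:
            raise ValueError("stratum lacks CRM or non-CRM firms")
        diff = sum(s[1]) / len(s[1]) - sum(s[0]) / len(s[0])
        estimate += diff * (len(s[0]) + len(s[1])) / n_total
    return estimate


# Invented toy data: two strata of four firms each.
toy = [
    (0.30, 1, 1), (0.40, 1, 0), (0.25, 0, 0), (0.50, 0, 0),
    (0.70, 1, 1), (0.80, 1, 1), (0.90, 1, 1), (0.65, 0, 1),
]
effect = stratified_effect(toy, cuts=[0.2, 0.6, 1.0])
print(f"stratified estimate = {effect:.2f}")
```

In the toy data the first stratum shows a difference of 0.5 and the second a difference of 0, so the equally weighted average is 0.25. Raising an error for a stratum with only one group mirrors the requirement that each subclass contain both CRM and non-CRM firms.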

Because the matching estimators do not identify 
the treatment effect outside the region of common 
support, our first step is to calculate the range of 
support for both the CRM and non-CRM groups. 
In our study, the support for the CRM group is 
(0.22-0.89) and the support for the control group 
is (0.18-0.91). Following Rubin (2001), we dropped 
from our analysis six non-CRM firms that fell out- 
side the common support region (propensity scores 
for these firms are either less than 0.22 or more than 
0.89). As Rubin and Waterman (2006) elaborate, it 
is appropriate to drop these firms from our analysis 
because for these firms, we cannot find an appropri- 
ate "clone." Our second step is to create the sub- 
classes along the space of common support. Follow- 
ing Dehejia and Wahba (2002), we initially classified 
all observations in five equal-sized subclasses based 
on propensity scores. Because there are no firms in 
the 0-0.2 propensity score range, this meant starting
with four subclasses (with lower bounds of propensity
scores at 0.2, 0.4, 0.6 and 0.8). Then we checked for
any differences in propensity scores across CRM and 
non-CRM firms in each stratum. If we found any sig- 
nificant differences, we subdivided the stratum un- 
til we obtained a similar distribution of propensity 
scores and covariates in each stratum. This resulted 
in five strata that achieved propensity score and co- 
variate balance across CRM and non-CRM firms. 
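
The common-support step can be sketched as follows. Only the end points of the two score lists mirror the supports reported above, (0.22-0.89) for CRM firms and (0.18-0.91) for non-CRM firms; the interior scores are invented, so the number of firms dropped here is not the six reported in the text.

```python
def common_support(scores_crm, scores_non):
    """Overlap of the two groups' propensity score ranges."""
    low = max(min(scores_crm), min(scores_non))
    high = min(max(scores_crm), max(scores_non))
    return low, high


def outside_support(scores, low, high):
    """Scores that fall outside the closed interval [low, high]."""
    return [s for s in scores if not (low <= s <= high)]


# Hypothetical scores; only the minima and maxima follow the text.
scores_crm = [0.22, 0.40, 0.65, 0.89]
scores_non = [0.18, 0.20, 0.30, 0.55, 0.91]

low, high = common_support(scores_crm, scores_non)
dropped = outside_support(scores_non, low, high)
print(f"common support = ({low}, {high}); dropped non-CRM scores: {dropped}")
```
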

Table 2
Covariate balance after subclassification on propensity score^a

          Customer-facing IT systems    IT investments                Manufacturing
Stratum   CRM            Non-CRM        CRM           Non-CRM         CRM           Non-CRM
1         3.65 (0.30)    3.64 (0.23)    2.02 (0.24)   2.14 (0.23)     0.73 (0.09)   0.71 (0.08)
2         6.17 (0.16)    6.12 (0.15)    2.70 (0.24)   2.61 (0.18)     0.41 (0.05)   0.45 (0.05)
3         8.79 (0.19)    8.81 (0.23)    3.50 (0.39)   3.34 (0.51)     0.36 (0.06)   0.42 (0.08)
4         10.86 (0.14)   10.36 (0.35)   4.84 (0.46)   4.55 (0.68)     0.18 (0.04)   0.14 (0.07)

^a The mean and standard deviation (in parentheses) are given for each category. Note that none of the covariates has a
statistically significant difference across the CRM and non-CRM firms within a stratum.



Because of our limited sample size, the fifth stratum had only
four non-CRM firms. Therefore, we combined the fourth and fifth
strata to have a reasonable number of CRM and non-CRM units in
each stratum. Thus, our final analysis contained four
subclasses. (We also tried three and six subclasses. However,
our attempts to use three and six subclasses did not succeed
because this prevented a balancing of covariates in each
stratum, one of the critical conditions for subsequent
analysis. Using more than six subclasses is not practical in
our case because of the limited sample size that is typical in
firm level research, unlike the common use of nine or ten
subclasses in individual level studies that have several
thousand observations.) Table 2 shows the covariate balance and
summary statistics across CRM and non-CRM firms within each
stratum after subclassification based on propensity scores. We
note that the covariate balance after propensity score
stratification is better (see Figure 2 and Table 2) compared to
that before stratification (see Figure 1 and Table 1).

Fig. 2. Covariate balance after stratification. [Bar charts, one panel per stratum (strata 1 to 4), showing the mean of CUSTAPPS and the mean of ITPC for non-CRM and CRM firms in the service (SVCE) and manufacturing (MFG) sectors.]

Table 3
Rubin's diagnostics for assessing covariate balance before and after stratification

                                                                        R2, variance ratio orthogonal to the propensity score
                                     B               R1                 ------------------------------------------------------
                                     (standardized   (ratio of          Manufacturing   Customer-facing   IT investments
                                     difference      variances of                       IT systems
                                     in means)       propensity score)
Before stratification                0.65            1.34               0.91            0.93              1.97
After four subclass stratification   0.22            1.16               1.05            0.96              1.25

We used diagnostics suggested by Rubin (2001) to further
verify the success of the propensity score stratification in
achieving covariate balance. We computed the following: (1) B,
the standardized difference in propensity score means in the
CRM and non-CRM groups (recommended value less than 1/2),
(2) R1, the ratio of the propensity score variances in the two
groups (recommended value between 0.8 and 1.25) and (3) R2, the
ratio of variances of the residuals of each covariate (MFC,
CUSTAPPS and ITPC) after adjusting for the propensity score
(recommended value between 0.80 and 1.25). Table 3 presents the
results of these calculations and shows the efficacy of four
subgroups based on propensity score stratification in achieving
the covariate balance, because the values of B, R1 and R2 fall
within the recommended range after stratification. In
particular, stratification helped us bring B, R1 and R2 closer
to recommended values.
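
The first two diagnostics are simple to compute once propensity scores are available. The sketch below assumes that B is standardized by the square root of the average of the two group variances, a common convention that the text does not spell out, and it omits R2, which additionally requires regressing each covariate on the propensity score. The score lists are invented; the two range checks at the end use the values reported in Table 3.

```python
import math
from statistics import mean, variance


def rubin_b(scores_crm, scores_non):
    """Standardized difference in mean propensity score between groups.

    Standardizing by sqrt((s1^2 + s0^2) / 2) is an assumption of this sketch.
    """
    pooled_sd = math.sqrt((variance(scores_crm) + variance(scores_non)) / 2)
    return (mean(scores_crm) - mean(scores_non)) / pooled_sd


def rubin_r1(scores_crm, scores_non):
    """Ratio of the propensity score variances in the two groups."""
    return variance(scores_crm) / variance(scores_non)


def within_recommended(b, r1):
    """Check B < 1/2 (in absolute value) and 0.8 <= R1 <= 1.25."""
    return abs(b) < 0.5 and 0.8 <= r1 <= 1.25


# Invented scores for illustration.
b = rubin_b([0.40, 0.60, 0.80], [0.35, 0.55, 0.75])
r1 = rubin_r1([0.40, 0.60, 0.80], [0.35, 0.55, 0.75])
print(f"B = {b:.2f}, R1 = {r1:.2f}, ok = {within_recommended(b, r1)}")

# Values of B and R1 reported in Table 3.
print("before stratification:", within_recommended(0.65, 1.34))
print("after stratification:", within_recommended(0.22, 1.16))
```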

Table 4 shows the average effect of CRM on marketing
effectiveness within each stratum. The average effect (over the
four subclasses) of CRM on one-to-one marketing is 23
percentage points. This effect takes into account selection
bias due to correlation between three observed variables (used
in the selection equation) and the treatment variable. This
analysis suggests that the naive estimator D overestimated A in
this application. One reason for this may be that the presence
of a greater number of customer-facing IT systems is positively
associated with both use of CRM systems and one-to-one
marketing effectiveness, causing Bias to be positive.

Table 4
Causal effect of CRM

                        CRM             Non-CRM         Effect of CRM on one-to-one
Stratum                 treated units   control units   marketing effectiveness (standard errors)
1                       23              31              0.35 (0.13)
2                       90              104             0.21 (0.07)
3                       72              42              0.12 (0.10)
4                       97              22              0.31 (0.11)
Average causal effect                                   0.23 (0.05)
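
The overall figure in Table 4 can be approximately reproduced from the stratum rows. The sketch below assumes that each stratum is weighted by its share of all retained firms and that the stratum estimates are independent, so their variances add; neither choice is stated in the text. Weighting by the number of CRM firms instead gives a value that also rounds to 0.23.

```python
import math

# Rows of Table 4: (CRM firms, non-CRM firms, estimated effect, standard error).
table4 = [
    (23, 31, 0.35, 0.13),
    (90, 104, 0.21, 0.07),
    (72, 42, 0.12, 0.10),
    (97, 22, 0.31, 0.11),
]


def pooled_effect(rows):
    """Size-weighted average of stratum effects with its standard error.

    Weights are stratum shares of all firms; the standard error treats the
    stratum estimates as independent. Both are assumptions of this sketch.
    """
    sizes = [n1 + n0 for n1, n0, _, _ in rows]
    total = sum(sizes)
    weights = [n / total for n in sizes]
    effect = sum(w * row[2] for w, row in zip(weights, rows))
    se = math.sqrt(sum((w * row[3]) ** 2 for w, row in zip(weights, rows)))
    return effect, se


effect, se = pooled_effect(table4)
print(f"average causal effect = {effect:.2f} (SE {se:.2f})")
```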

We now compare the results obtained using the propensity
score approach with those obtained using a probit model. If
there is a substantial difference in the composition of
treatment and control groups, classical regression adjustment
methods may not be reliable. Regression adjustment methods make
strong functional form assumptions (e.g., constant treatment
effect) based on extrapolation, and thus may not provide an
estimate of the causal effect one hopes to estimate. Despite
this consideration, researchers who wish to employ classical
regression methods (such as ordinary least squares, log-linear
models or logistic regression) to adjust for pretreatment
covariates will find the propensity score e(x) to be a useful
diagnostic tool that provides better visibility of the extent
to which treatment and control groups differ. For example,
Rubin's diagnostics in row 1 of Table 3 (before stratification)
suggest that in our CRM application, regression adjustment may
reliably adjust for selection bias because the values of B, R1
and R2 are not significantly different from the recommended
values (except for the value of R2 for IT investments, all
other values are only slightly outside the recommended range).
Therefore, we expect to see a similar estimated treatment
effect if we use a regression technique. To confirm this, we
ran a probit model of one-to-one marketing effectiveness on all
covariates used in the propensity score model and the CRM
indicator variable. We find that CRM applications are
positively associated with an improvement in one-to-one
marketing effectiveness (β = 0.59, p < 0.001). Because the
coefficients of probit models are not easily interpretable, we
also discuss the effect of a unit change in the CRM variable on
the probability of gain in one-to-one marketing effectiveness.
We find that CRM systems are associated with a 23.37% increase
in the probability of an increase in one-to-one marketing
effectiveness, holding all other variables constant at their
mean values. The results of this analysis suggest the
usefulness of the propensity score approach as a first step to
test for covariate balance before making subsequent regression
adjustments.

Although in this particular application, the ef- 
fect of CRM using a probit model is very close to 
the 23 percentage point increase estimated using 
the propensity score approach, there are two items 
worth noting. First, a probit analysis of Y on Z and
X provides a treatment effect that is conditional 
on the specific values of other covariates, while the 
propensity score method shown here provides an es- 
timate of the average (over X) causal effect. If one 
were to compare the effect of CRM on Y from both 
analyses on the probit scale and a linear additive 
model (in X) is appropriate, this distinction makes 
little difference. However, if the effect of CRM on 
Y varied depending on whether a firm is in the 
service or manufacturing sector, for example, then 
a probit model with linear additive terms in X is 
not appropriate. Even if this deficiency were corrected
(say by including a CRM × MFC interaction
term), the two analyses are not directly compara- 
ble, because they answer two different causal ques- 
tions. Second, in general, the results of regression 
based models may not sustain "causal" interpreta- 
tions, because they do not always ensure covariate 
balance across treatment and control firms (Dehejia 
and Wahba, 1999). 

4. DISCUSSION 

Our goal in this paper was to study the causal 
effect of CRM systems on one-to-one marketing ef- 
fectiveness. We find evidence that CRM systems 
have indeed caused an improvement in one-to-one 
marketing effectiveness of firms. The use of poten- 
tial outcomes reasoning and a propensity score ap- 
proach offers four primary advantages to estimate 
the causal effect of sharp treatments such as CRM
systems in a way that is not fully or satisfactorily
covered by traditional approaches such as linear
regression or probit models. First, the potential
outcomes framework permits a more precise articulation
of causal questions in terms of a comparison between
two alternative states of the same firm (firm A
with CRM versus firm A without CRM). As Rubin
and Waterman (2006) elaborate, the causal effect 
is not a change in time of the observed outcome: 
instead, it is the difference between two potential 
outcomes, only one of which is observed. This clari- 
fies why "before and after studies" (Connolly, 2003) 
and quasiexperimental studies [e.g., event studies 
that generate useful insights by studying changes 
in stock returns following an announcement regard- 
ing an electronic commerce application (Dehning, 
Richardson, Urbaczewski and Wells, 2004; Dehning, 
Richardson and Zmud, 2003; Im, Dow and Grover, 
2001)] do not estimate "causal" effects in the sense 
described in this paper. 

Second, although in the firm level application used 
in this paper we did not have many covariates, in 
general the propensity score reduces the dimension- 
ality of observed covariates and avoids the problem 
of matching on multiple covariates. Because firms 
differ on multiple attributes, the propensity score 
approach offers a solution to the curse of dimen- 
sionality that has plagued much of the firm level re- 
search in the business value of IT literature, where 
researchers had to match on relatively few covari- 
ates to avoid losing degrees of freedom or statistical 
power. 

Third, because we are primarily interested in esti- 
mating the causal effect of CRM systems, the propen- 
sity score approach allows us to focus on estimating 
that effect without having to specify how other ob- 
served covariates, such as IT investments and in- 
dustry or other customer-facing IT systems that are 
correlated with CRM, may be related to the out- 
come variable one-to-one marketing effectiveness. In 
other words, the propensity score method allows re- 
searchers to escape strong functional form assump- 
tions (e.g., constant treatment effect or linear effect 
of covariates) that are implicit in regression analysis 
(Dehejia and Wahba, 1999; Rubin, 1997). 

Fourth and finally, the propensity score approach 
provides visibility of the extent to which CRM and 
non-CRM groups are similar to or different from 
each other based on observed covariates. Recall that 
while we started with a sample size of 487 firms, we 
had to drop six non-CRM firms because, for these 
firms, we did not find a matching CRM firm within the region
of common support. We also ensured that CRM and non-CRM
firms had a similar distribution of covariates within 
each stratum. In contrast, the extent to which CRM 
and non-CRM groups overlap is rarely, if ever, exam- 
ined when analysts use classical regression methods. 
Also note that the propensity score approach makes 
a separation between the balancing (or propensity 
score) stage, in which the goal is to ensure that treat- 
ment groups are comparable, and the analysis stage, 
in which the desired causal question is addressed. In 
contrast, with traditional regression methods, the 
researcher attempts to adjust simultaneously for se- 
lection bias and model the causal phenomena of in- 
terest. Unfortunately, the consideration of repeated 
models in this fashion (with the objective of con- 
trolling for covariates) may lead to unreliable treat- 
ment effect estimates. With the propensity score ap- 
proach, on the other hand, a researcher is encour- 
aged to "mine" the data (propose different propen- 
sity score models, consider different subclassifica- 
tions of the propensity score, etc.) with the objec- 
tive of achieving balance on observed pretreatment 
covariates. Since this first stage of analysis does not 
involve outcome data, there is less concern about 
inappropriately slanting the results of the analysis. 
Together, these advantages of the propensity score 
approach provide a powerful motivation for greater 
use of this approach to study causal effects not only 
in CRM and electronic commerce research, but also 
in other fields of management research. 
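
The separation between the balancing stage and the analysis stage can be sketched in a few lines of Python. The sketch below assumes that propensity scores have already been estimated; the toy data, the covariate name it_investment and all function names are hypothetical and are not taken from the data set used in this paper. The first three functions never read the outcome, which is the point of the design stage.

```python
from statistics import mean


def trim_to_common_support(units):
    """Keep units whose score lies in the range shared by both groups."""
    treated = [u["score"] for u in units if u["treated"]]
    control = [u["score"] for u in units if not u["treated"]]
    low = max(min(treated), min(control))
    high = min(max(treated), max(control))
    return [u for u in units if low <= u["score"] <= high]


def assign_strata(units, n_strata):
    """Split units into roughly equal-sized strata by ranked score."""
    ranked = sorted(units, key=lambda u: u["score"])
    strata = [[] for _ in range(n_strata)]
    for rank, unit in enumerate(ranked):
        strata[rank * n_strata // len(ranked)].append(unit)
    return strata


def covariate_gap(stratum, name):
    """Balance check: treated-minus-control mean of one covariate."""
    treated = [u[name] for u in stratum if u["treated"]]
    control = [u[name] for u in stratum if not u["treated"]]
    return mean(treated) - mean(control)


def stratified_effect(strata):
    """Analysis stage: size-weighted average of within-stratum differences."""
    total = sum(len(s) for s in strata)
    effect = 0.0
    for s in strata:
        treated = [u["outcome"] for u in s if u["treated"]]
        control = [u["outcome"] for u in s if not u["treated"]]
        if not treated or not control:
            raise ValueError("each stratum needs treated and control units")
        effect += len(s) / total * (mean(treated) - mean(control))
    return effect


# Hypothetical toy data: (score, treated, it_investment, binary outcome).
ROWS = [
    (0.05, False, 1, 0), (0.20, True, 2, 1), (0.25, False, 2, 0),
    (0.30, False, 3, 1), (0.35, True, 3, 1), (0.40, False, 4, 0),
    (0.45, False, 4, 0), (0.60, True, 6, 1), (0.65, False, 6, 1),
    (0.70, True, 7, 1), (0.75, False, 7, 0), (0.80, True, 8, 0),
    (0.85, False, 9, 1),
]
units = [
    {"score": s, "treated": t, "it_investment": x, "outcome": y}
    for s, t, x, y in ROWS
]

# Design stage: uses only scores, treatment and covariates.
on_support = trim_to_common_support(units)
strata = assign_strata(on_support, n_strata=2)
gaps = [covariate_gap(s, "it_investment") for s in strata]

# Analysis stage: the outcome is read only after the design is fixed.
effect = stratified_effect(strata)
```

In practice an analyst would revisit the propensity score model or the number of strata until the covariate gaps are acceptably small, and only then compute the effect. Two strata are used here only to keep the toy example small.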

Although this study extends previous research on 
the effect of CRM systems on firm performance 
(Boulding et al., 2005; Mithas, Krishnan and For- 
nell, 2005; Srinivasan and Moorman, 2005), several 
opportunities for future research remain. First, busi- 
ness value researchers have hinted at the possibil- 
ity that, depending on their degree of preparedness, 
firms may benefit differentially from their IT sys- 
tems (Lucas, 1993). Future research should system- 
atically explore the issues related to treatment effect 
heterogeneity in the CRM context. Likewise, find- 
ings of observational studies, no matter how care- 
fully done, are subject to the bias due to omission 
of unobservable variables and thus point to the need 
for a sensitivity analysis to assess the degree of po- 
tential bias. Our related work takes initial steps to 
address these issues of treatment effect heterogene- 
ity and sensitivity analysis to assess the impact due 
to selection on unobservables at the firm and indi- 
vidual levels (Mithas and Krishnan, 2004a, b). Sec- 
ond, much of the CRM related work pertains to the 
business-to-consumer (B2C) domain and there are 
very few studies that have studied the phenomenon 
of CRM implementation in the business-to-business 
(B2B) domain (Mithas, Jones, Krishnan and Fornell, 
2005). Since B2B transactions constitute a much 
greater proportion of economic activity compared 
with B2C transactions, it will be useful to under- 
stand the antecedents and consequences of CRM in 
the B2B context. 

Going beyond the CRM context, we suggest that 
the potential outcome framework applies more broadly, 
from firm level questions to individual level and econ- 
omy level causal questions such as those mentioned 
in the Introduction section. It is encouraging to note 
increasing use of the language of potential outcomes 
in academic research. For example, Bhagwati, 
Panagariya and Srinivasan (2004) seem to suggest 
that assessing the economy level causal effect of offshoring 
on wages and jobs requires the use of the potential 
outcomes framework to pose the correct question. They 
note, "Forrester does not explain whether the pre- 
diction is that the U.S. economy will have 3.3 mil- 
lion fewer jobs in 2015 than it would otherwise have 
had because of outsourcing. . . or whether the predic- 
tion is that outsourcing will cause 3.3 million U.S. 
workers to shift from jobs that they might otherwise 
have had into different jobs. . . " (Bhagwati, Pana- 
gariya and Srinivasan, 2004, page 97). Similarly, at 
the individual level, researchers are interested in in- 
vestigating the causal effect of the MBA degree for 
IT professionals. Use of the propensity score ap- 
proach in this context can be informative because 
it shifts attention from estimation of causal effects 
based on a "before and after" type analysis to the 
potential outcome analysis used in this paper (Con- 
nolly, 2003; Mithas and Krishnan, 2004b; Pfeffer and 
Fong, 2003). 

We acknowledge some limitations and challenges 
in the use of a propensity score approach in our 
CRM application. First, the most important lim- 
itation of this paper is that we have a relatively 
small set of observed covariates as is typical in firm 
level studies (Mithas, Krishnan and Fornell, 2005). 
Therefore, we are unable to take complete advantage 
of the power of the propensity score approach 
as a dimension reduction tool. Second, we focus on 
binary treatment and outcome variables in this ap- 
plication. However, the propensity score approach 
has been extended for ordinal, categorical and arbi- 
trary treatments, and is also applicable to contin- 
uous outcomes. Third, as in other studies (Rosen- 
baum and Rubin, 1983a), we did not specifically ac- 
count for the uncertainty in estimation of propen- 
sity scores in our estimation of standard errors of 
the overall causal effect. However, recent work sug- 
gests several methods to compute standard errors 
considering various sources of uncertainty, and finds 
that standard errors computed using different meth- 
ods often give similar results (Agodini and Dynarski, 
2004; Benjamin, 2003). 

To conclude, this paper assessed the causal ef- 
fect of CRM systems on one-to-one marketing ef- 
fectiveness using a propensity score approach. We 
provided a detailed procedure for using 
propensity score matching to assess the causal 
effect of CRM systems using a real data set from 
electronic commerce research. This paper illustrates 
the usefulness of the potential outcome approach 
and propensity score stratification to pose causal 
questions and to estimate causal effects of manage- 
rial and electronic commerce related IT interven- 
tions from observational data. We hope that this 
paper will encourage electronic commerce and man- 
agement researchers to utilize this approach to an- 
swer their substantively interesting research ques- 
tions in causal terms. 